A precise high-dimensional asymptotic theory for boosting and minimum-ℓ1-norm interpolated classifiers

Authors

Abstract

This paper establishes a precise high-dimensional asymptotic theory for boosting on separable data, taking both statistical and computational perspectives. We consider a setting where the number of features (weak learners) p scales with the sample size n, in an overparametrized regime. Under a class of statistical models, we provide an exact analysis of the generalization error of boosting when the algorithm interpolates the training data and maximizes the empirical ℓ1-margin. Further, we explicitly pin down the relation between the boosting test error and the optimal Bayes error, as well as the proportion of active features at interpolation (with zero initialization). In turn, these precise characterizations answer certain questions raised in (Neural Comput. 11 (1999) 1493–1517; Ann. Statist. 26 (1998) 1651–1686) surrounding boosting, under the assumed data-generating processes. At the heart of our theory lies an in-depth study of the maximum ℓ1-margin, which can be accurately described by a new system of nonlinear equations; to analyze this margin, we rely on Gaussian comparison techniques and develop a novel uniform deviation argument. Our arguments can handle (1) any finite-rank spiked covariance model for the feature distribution and (2) variants of boosting corresponding to general ℓq-geometry, for q ∈ [1, 2]. As a final component, via the Lindeberg principle, we establish a universality result showcasing that the scaled ℓ1-margin (asymptotically) remains the same whether the covariates used for boosting arise from a nonlinear random feature model or an appropriately linearized model with matching moments.
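The central object in the abstract, the empirical maximum ℓ1-margin, can be computed exactly on a finite dataset by solving a linear program. The sketch below is a generic LP formulation under standard assumptions (not code from the paper; `max_l1_margin` is a hypothetical helper name): it splits the coefficient vector into positive and negative parts so that both the margin constraints and the ℓ1 budget become linear.

```python
import numpy as np
from scipy.optimize import linprog

def max_l1_margin(X, y):
    """Empirical max l1-margin: max over ||theta||_1 <= 1 of min_i y_i <x_i, theta>.
    Solved as an LP with theta = theta_plus - theta_minus, both nonnegative."""
    n, p = X.shape
    S = y[:, None] * X                           # signed design: row i is y_i * x_i
    # decision variables: [theta_plus (p), theta_minus (p), t]; minimize -t
    c = np.zeros(2 * p + 1)
    c[-1] = -1.0
    # margin constraints: t - S theta_plus + S theta_minus <= 0  (i.e. t <= y_i x_i' theta)
    A_margin = np.hstack([-S, S, np.ones((n, 1))])
    # l1 budget: sum(theta_plus) + sum(theta_minus) <= 1
    A_norm = np.hstack([np.ones((1, 2 * p)), np.zeros((1, 1))])
    A_ub = np.vstack([A_margin, A_norm])
    b_ub = np.concatenate([np.zeros(n), [1.0]])
    bounds = [(0, None)] * (2 * p) + [(None, None)]  # t is free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[-1]                             # the optimal margin t

# overparametrized toy data (p > n), separable since label = sign of feature 1
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 100))
y = np.sign(X[:, 0])
print(max_l1_margin(X, y) > 0)                   # prints True: separable => positive margin
```

Because the data are separable by the first coordinate alone, θ = e1 is feasible with a strictly positive margin, so the LP optimum is positive; in the overparametrized regime p > n the maximizer is typically sparse, consistent with the abstract's "proportion of active features at interpolation".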


Similar articles

Boosting classifiers for drifting concepts

This paper proposes a boosting-like method to train a classifier ensemble from data streams. It naturally adapts to concept drift and allows quantifying the drift in terms of its base learners. The algorithm is empirically shown to outperform learning algorithms that ignore concept drift. It performs no worse than advanced adaptive time window and example selection strategies that store all the...


Multiclass Boosting for Weak Classifiers

AdaBoost.M2 is a boosting algorithm designed for multiclass problems with weak base classifiers. The algorithm is designed to minimize a very loose bound on the training error. We propose two alternative boosting algorithms which also minimize bounds on performance measures. These performance measures are not as strongly connected to the expected error as the training error, but the derived bou...


Rank 72 high minimum norm lattices

Given a polarization of an even unimodular lattice and an integer k ≥ 1, we define a family of unimodular lattices L(M, N, k). Of special interest are certain L(M, N, 3) of rank 72. Their minimum norms lie in {4, 6, 8}. Norms 4 and 6 do occur. Consequently, 6 becomes the highest known minimum norm for rank 72 even unimodular lattices. We discuss how norm 8 might occur for such an L(M, N, 3). We note ...


Class-imbalanced classifiers for high-dimensional data

A class-imbalanced classifier is a decision rule to predict the class membership of new samples from an available data set where the class sizes differ considerably. When the class sizes are very different, most standard classification algorithms may favor the larger (majority) class, resulting in poor accuracy in the minority class prediction. A class-imbalanced classifier typically modifies a ...


Boosting Classifiers Regionally

This paper presents a new algorithm for Boosting the performance of an ensemble of classifiers. In Boosting, a series of classifiers is used to predict the class of data where later members of the series concentrate on training data that is incorrectly predicted by earlier members. To make a prediction about a new pattern, each classifier predicts the class of the pattern and these predictions ...
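The reweighting scheme this paragraph describes, where later ensemble members concentrate on the training points that earlier members misclassified, can be illustrated with plain AdaBoost over decision stumps. This is a minimal generic sketch of that idea, not the regional algorithm proposed in the paper above; all names here are illustrative.

```python
import numpy as np

def adaboost_stumps(X, y, T=20):
    """Plain AdaBoost with decision stumps (labels y in {-1, +1}): each round
    reweights the sample so later stumps concentrate on points that earlier
    stumps predicted incorrectly."""
    n, p = X.shape
    w = np.full(n, 1.0 / n)                 # uniform initial sample weights
    ensemble = []                           # entries: (alpha, feature, threshold, sign)
    for _ in range(T):
        best = None
        for j in range(p):                  # exhaustive stump search
            for thr in X[:, j]:
                for sgn in (1.0, -1.0):
                    pred = sgn * np.where(X[:, j] > thr, 1.0, -1.0)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, sgn, pred)
        err, j, thr, sgn, pred = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # vote weight of this stump
        w = w * np.exp(-alpha * y * pred)       # upweight the mistakes
        w /= w.sum()
        ensemble.append((alpha, j, thr, sgn))
    return ensemble

def predict(ensemble, X):
    agg = sum(a * s * np.where(X[:, j] > t, 1.0, -1.0)
              for a, j, t, s in ensemble)
    return np.sign(agg)

# toy separable data: label is the sign of the first feature
Xt = np.array([[-2.0, 0.0], [-1.0, 1.0], [1.0, 0.0], [2.0, 1.0]])
yt = np.array([-1.0, -1.0, 1.0, 1.0])
ens = adaboost_stumps(Xt, yt, T=5)
print(np.all(predict(ens, Xt) == yt))   # prints True
```

The weight update `w * exp(-alpha * y * pred)` is the mechanism the paragraph alludes to: correctly classified points shrink in weight while misclassified points grow, so the next stump's weighted error is dominated by the hard examples.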



Journal

Journal: Annals of Statistics

Year: 2022

ISSN: 0090-5364, 2168-8966

DOI: https://doi.org/10.1214/22-aos2170